Key node identification for a network topology using hierarchical comprehensive importance coefficients

Key nodes are similar to important hubs in a network structure, which can directly determine the robustness and stability of the network. By effectively identifying and protecting these critical nodes, the robustness of the network can be improved, making it more resistant to external interference and attacks. There are various topology analysis methods for a given network, but key node identification methods often focus on either local attributes or global attributes. Designing an algorithm that combines both attributes can improve the accuracy of key node identification. In this paper, the constraint coefficient of a weakly connected network is calculated based on the Salton indicator, and a hierarchical tenacity global coefficient is obtained by an improved K-Shell decomposition method. Then, a hierarchical comprehensive key node identification algorithm is proposed which can comprehensively indicate the local and global attributes of the network nodes. Experimental results on real network datasets show that the proposed algorithm outperforms the other classic algorithms in terms of connectivity, average remaining edges, sensitivity and monotonicity.

Among methods based on graph entropy theory, Qiao et al. 20 built a model that decomposes a graph into subgraphs and then computes the entropies of neighboring nodes. Hu et al. 21 used this method to identify key nodes, and their experiments showed that it can be applied to various types of complex networks. Lin et al. 22 used both the information entropy weight method and the analytic hierarchy process to measure node importance.
In recent years, some methods based on multi-attribute combination have also been proposed. TOPSIS 23 combines multiple centralities with equal weights to evaluate the importance of nodes, which may not be practical. To deal with this problem, an ideal solution ranking weighted algorithm proposed by Hu et al. 24 assigns different weights to individual centralities. Yang et al. 25 proposed a dynamic TOPSIS weighted ranking method based on the infection recovery model and gray correlation analysis, which can dynamically adjust the weight of each centrality. In addition, Sun et al. 26 compared different methodologies such as influential node ranking and influence maximization to identify key nodes in social networks and introduced Shapley centrality as a potentially more general approach. Zhang et al. 27 proposed a new semi-local centrality metric based on the relative change in the average shortest path, enhancing the efficiency of identifying influential nodes. Zhu et al. 28 introduced a gravity model centrality method, termed HVGC, that outperforms existing methods in evaluating node importance in complex networks. Ren et al. 29 discussed methods that consider multiplex influences to identify key nodes in complex networks. Zhao et al. 30 presented a novel algorithm called NEGM that excels in measuring the relative importance of nodes in various network types, integrating network embedding with a gravity model for enhanced accuracy. Multi-attribute methods of this kind are therefore a likely development trend in the field of complex networks.
There are various topology analysis methods for existing networks, but key node identification methods often focus only on local attributes or only on global attributes, and rarely take both into account at the same time. According to Burt's structural hole theory 31, the structural position of a node in a social network is more important than the corresponding strength of external relationships, since better structural positions have more information, resources, and power. Location advantages in social networks include local advantages and global advantages. The former can be quantified using local structural information, while the latter is determined by global topological connections. For this reason, a comprehensive analysis of local and global attributes is crucial for evaluating the importance of complex network nodes. We therefore propose a comprehensive importance indicator as a tool for evaluating the importance of network nodes. The main contributions of this paper are summarized as follows:
(1) Based on the Salton indicator, a weakly connected network constraint coefficient is constructed, and the local influence indicator is then refined.
(2) Based on the improved K-Shell decomposition method, a hierarchical tenacity global coefficient is constructed, and the global influence indicator is refined.
(3) By integrating the weakly connected network constraint coefficient and the hierarchical tenacity global coefficient, a comprehensive identification algorithm for local and global attributes is proposed.
Experimental results show that the proposed algorithm outperforms many existing algorithms on real network datasets.
This paper is organized as follows. In Part 2, a hierarchical comprehensive node importance identification algorithm is proposed and classic node importance identification algorithms are briefly introduced. In Part 3, evaluation indicators are introduced to measure the performance of each algorithm. In Part 4, real network datasets are introduced for experiments. In Part 5, the comprehensive importance identification algorithm and other classic identification algorithms are tested on real network datasets. In Part 6, conclusions are drawn.

Construction of hierarchical tenacity global coefficient
Consider an undirected topological graph $G = (V, E)$, where the total number of nodes is $N = |V|$ and the total number of edges is $M = |E|$. Define $A$ as the adjacency matrix of the undirected network and $a_{ij}$ as the $(i, j)$-th entry of $A$. If node i is connected to node j, then $a_{ij} = 1$; otherwise, $a_{ij} = 0$. For an undirected graph, $a_{ij} = a_{ji}$ and $a_{ii} = 0$. Define $\Gamma_i$ as the set of neighbors of node i. Let $k_i$ represent the degree of node i, and let $e_{ij}$ represent the edge between node i and node j. For the undirected graph shown in Fig. 1, $N = 15$, $M = 19$, $k_G = 6$, $a_{KM} = 1$ and $a_{FD} = 0$.
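The notation above maps directly onto a neighbor-set representation. The following is a minimal sketch on a small hypothetical graph (not the graph of Fig. 1); the helper names `build_graph` and `adjacency_entry` are illustrative assumptions, not part of the paper.

```python
from collections import defaultdict


def build_graph(edges):
    """Return (neighbors, degree) for an undirected simple graph."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # a_ii = 0: self-loops are ignored
        neighbors[u].add(v)
        neighbors[v].add(u)  # a_ij = a_ji: store both directions
    degree = {v: len(nbrs) for v, nbrs in neighbors.items()}
    return dict(neighbors), degree


def adjacency_entry(neighbors, i, j):
    """Return a_ij: 1 if i and j are connected, otherwise 0."""
    return 1 if j in neighbors.get(i, set()) else 0


# Hypothetical toy graph: a triangle a-b-c with a pendant node d on c.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
neighbors, degree = build_graph(edges)
N = len(neighbors)              # number of nodes
M = sum(degree.values()) // 2   # each edge is counted twice in the degree sum
```

Storing neighbor sets rather than a dense matrix keeps the later set operations on $\Gamma_i$ (intersections and unions) cheap for sparse networks.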
The importance of network nodes is analyzed by designing a method for identifying key nodes using local and global attributes. The identification algorithm is summarized in the following three steps:
(1) Construct the weakly connected network constraint coefficient based on the Salton indicator.
(2) Construct the hierarchical tenacity global coefficient based on the improved K-Shell decomposition method.
(3) Construct a comprehensive indicator of local and global attributes based on the normalization technique.
Calculate the comprehensive indicator of each node in the network and identify the importance of all the nodes.

Construction of weakly connected network constraint coefficient based on Salton indicator
This section quantifies the local attributes of each node. Structural hole theory provides a new perspective for understanding the local behavior of individuals. A structural hole is a gap between two disconnected nodes. When these two unconnected nodes are connected by a third node, the bridging node usually has more information advantages and control advantages.
To quantify the control advantages of bridge nodes, Burt introduced the network constraint coefficient NCC 31. The NCC of node i is described as

$$NCC_i = \sum_{j \in \Gamma(i)} \Big( p_{ij} + \sum_{q \neq i, j} p_{iq}\, p_{qj} \Big)^2,$$

where $p_{ij}$ is the ratio of energy investment directly related to the given node i and node j, defined as

$$p_{ij} = \frac{a_{ij}}{\sum_{q \in \Gamma(i)} a_{iq}}.$$

As a local evaluation indicator of key nodes, the NCC of a node is usually negatively correlated with its importance in a given network. As the NCC decreases, the formation of structural holes is enhanced, and the importance of the node increases.
The NCC of a node is calculated based on the node's neighborhood topology, including the number of neighbors of the node and the corresponding closeness between neighbors. However, NCC only collects the information of nearest neighbors and ignores the structural information of farther neighbors. In fact, NCC is ineffective when faced with nodes bridging the same number of non-redundant contacts.
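As an illustration of how the constraint depends only on nearest neighbors, the sketch below computes Burt's constraint in its standard form, assuming an unweighted undirected graph so that $p_{ij} = 1/k_i$ for every neighbor j of i. The function name `burt_constraint` and the two toy graphs are hypothetical and are not taken from Fig. 1.

```python
def burt_constraint(neighbors, i):
    """Burt's constraint of node i, assuming p_ij = 1 / k_i (unweighted graph)."""
    k_i = len(neighbors[i])
    total = 0.0
    for j in neighbors[i]:
        p_ij = 1.0 / k_i
        # Indirect investment through every common neighbor q of i and j:
        # p_iq * p_qj, with p_qj = 1 / k_q.
        indirect = sum(
            (1.0 / k_i) * (1.0 / len(neighbors[q]))
            for q in neighbors[i]
            if q != j and j in neighbors[q]
        )
        total += (p_ij + indirect) ** 2
    return total


# Hypothetical toy graphs: a star (center bridges three leaves) and a triangle.
star = {"c": {"x", "y", "z"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}

print(burt_constraint(star, "c"))      # center of the star: low constraint
print(burt_constraint(triangle, "a"))  # fully redundant contacts: high constraint
```

The star center spans structural holes between all of its leaves and therefore gets a small value, while a triangle node has only redundant contacts and gets a large one, which matches the negative correlation between NCC and importance described above.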
For example, in Fig. 1, nodes C and F serve as bridges for node pairs (G, H) and (L, G) respectively. Nodes C and F have the same NCC; that is, the two nodes have the same local influence. However, it can be seen from the figure that, although nodes C and F have the same NCC, node C has higher-order neighbors and stronger propagation ability. Therefore, NCC cannot accurately quantify the difference between node C and node F in this network.
The above analysis shows that NCC only collects information from the nearest neighbors, which results in less accurate identification of local features of nodes. In order to improve the accuracy of the method, more local structural information needs to be considered. Therefore, we propose an improved weakly connected network constraint coefficient WNCC.
Kleinberg 32 points out that the strength of the connection between two people depends on the size of their shared social circle. When two social circles overlap, the power between them increases. Onnela et al. 33 showed that weak connections often serve as connectors among different communities and are of great significance to the overall connectivity of the network. Commonly used indicators to measure the effect of weak connections include the Salton indicator $S_{ij}$ and the Jaccard indicator $J_{ij}$ 34, which are defined respectively as

$$S_{ij} = \frac{|\Gamma(i) \cap \Gamma(j)|}{\sqrt{k_i k_j}}, \qquad J_{ij} = \frac{|\Gamma(i) \cap \Gamma(j)|}{|\Gamma(i) \cup \Gamma(j)|}.$$

The Salton indicator $S_{ij}$ and the Jaccard indicator $J_{ij}$ represent the degree of local overlap of adjacent nodes. The lower the degree of overlap, the stronger the weak connectivity. Obviously, the greater the number of weak connections associated with a node, the more important the node's role in maintaining network connectivity.
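The two overlap indicators can be sketched as follows, assuming their standard definitions (common neighbors divided by $\sqrt{k_i k_j}$ for Salton, and by the size of the neighbor union for Jaccard). The graph `g` is a hypothetical toy example, and both functions assume that i and j each have at least one neighbor.

```python
import math


def salton(neighbors, i, j):
    """Salton indicator: common neighbors over sqrt(k_i * k_j)."""
    common = len(neighbors[i] & neighbors[j])
    return common / math.sqrt(len(neighbors[i]) * len(neighbors[j]))


def jaccard(neighbors, i, j):
    """Jaccard indicator: common neighbors over the size of the neighbor union."""
    common = len(neighbors[i] & neighbors[j])
    return common / len(neighbors[i] | neighbors[j])


# Hypothetical toy graph: triangle a-b-c with a pendant node d attached to c.
g = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}

print(salton(g, "a", "c"), jaccard(g, "a", "c"))  # overlapping neighborhoods
print(salton(g, "c", "d"), jaccard(g, "c", "d"))  # no overlap: a weak connection
```

In this example the edge (c, d) has zero overlap under both indicators, so it is the kind of weak connection whose removal would cut node d off from the rest of the graph.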
For example, as shown in the left diagram in Fig. 3, node M is located on the shortest path between its neighbors A, B and C, and there is no direct connection between its three neighbor nodes. Therefore, the information transferred between nodes A, B, C and the clusters to which they belong will strongly depend on the links that connect them to node M. For node N in the right diagram, its importance in maintaining network connectivity is significantly lower than that of node M due to the existence of alternative communication channels within its neighborhood.
Inspired by the Salton indicator and the Jaccard indicator, the weak connection coefficient $w$ is designed as an indicator to measure the impact of a node's high-order neighbor structure information on the node's propagation ability, which is defined as
where nodes i and j are neighbor nodes of each other. $S_{ij}(k_i - 1) \times (k_j - 1) + 1 = 0$ when $S_{ij} = 0$, that is, when the intersection of the neighbor sets of node i and neighbor node j is an empty set, the propagation ability is evaluated based on the degrees of the two nodes. The term $|\Gamma(i) \cup \Gamma(j)| - 1$ eliminates the influence of nodes i and j themselves on the union of their neighbor nodes, and eliminates the possibility of the denominator being 0. The terms $k_i - 1$ and $k_j - 1$ eliminate the influence of nodes i and j on each other's neighbor nodes. The weak connection coefficient $w$ satisfies $w_{ij} = w_{ji}$.
Based on the weak connection coefficient $w$, the network constraint coefficient NCC is improved and the weakly connected network constraint coefficient WNCC is proposed, which is defined as
WNCC considers the structural information of distant neighbors and refines the local influence indicator. For Fig. 1, under the WNCC indicator, node C, which has stronger local importance, has a smaller WNCC than node F.
According to Table 1, the WNCC values of the remaining nodes in the example network in Fig. 1 exhibit a clearer hierarchy than the corresponding NCC values. Therefore, WNCC is more effective than the local NCC indicator.

Construction of hierarchical tenacity global coefficient based on improved K-Shell decomposition method
This section quantifies the global attributes of each node. Often, influential nodes also play a crucial role in maintaining network connectivity. If these most influential nodes are removed or do not participate in the propagation process, the final propagation scope and propagation efficiency will be reduced. Therefore, the global performance of nodes should be considered in terms of maintaining network connectivity and facilitating information flow.
Generally speaking, if removing a node results in more components and smaller connected components in the network, then the removed node is important for maintaining network connectivity. To measure the vulnerability of a given network, Cozzens et al. 35 proposed the concept of tenacity. Before defining tenacity, the concept of a cut set is first explained.
Suppose $S$ is a subset of the edge set $E$ of a connected graph $G$, and deleting all the edges of $S$ leaves the graph $G - S$ disconnected. If no proper subset of $S$ disconnects $G$ when deleted, then the edge set $S$ is said to be a cut set of the graph $G$.
In the graph $G$ shown in Fig. 4, $S_1 = \{c, d, f, g\}$ and $S_2 = \{b, c, f\}$ are two different subsets of the edge set $E$. For subset $S_1$, since $G - S_1$ is disconnected after deleting all edges in the set, and there is no proper subset of $S_1$ whose deletion disconnects $G$, edge set $S_1$ is a cut set of graph $G$. After deleting all edges of subset $S_2$, $G - S_2$ is still connected, so the edge set $S_2$ is not a cut set of graph $G$.
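The cut set definition can be checked mechanically. The sketch below is one possible way to do so on a hypothetical 4-cycle (not the graph of Fig. 4); it relies on the observation that, if some proper subset of S already disconnects the graph, then so does S with a single edge restored, so only single-edge restorations need to be tested. The function names are illustrative.

```python
def is_connected(nodes, edges):
    """Depth-first search check that all nodes lie in one component."""
    nodes = set(nodes)
    if not nodes:
        return True
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


def is_cut_set(nodes, edges, S):
    """True if removing S disconnects the graph and S is minimal."""
    S = {frozenset(e) for e in S}
    remaining = [frozenset(e) for e in edges if frozenset(e) not in S]
    if is_connected(nodes, remaining):
        return False  # G - S is still connected
    # Minimality: restoring any single edge of S must reconnect the graph.
    return all(is_connected(nodes, remaining + [e]) for e in S)


# Hypothetical example: the 4-cycle 1-2-3-4-1.
nodes = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(is_cut_set(nodes, edges, [(1, 2), (3, 4)]))          # minimal cut
print(is_cut_set(nodes, edges, [(1, 2)]))                  # does not disconnect
print(is_cut_set(nodes, edges, [(1, 2), (2, 3), (3, 4)]))  # disconnects, not minimal
```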
By combining the criteria of network damage cost, number of components and maximum connected component size, tenacity $T$ is defined as

$$T = \min_{A} \frac{|A| + \tau(G - A)}{\omega(G - A)},$$

where $A$ is a cut set of graph $G$, and $\tau(G - A)$ is the number of nodes of the maximum connected subgraph of the undirected graph $G - A$, which represents the size of the largest connected component after removing the edge set. $\omega(G - A)$ is the number of connected subgraphs of the undirected graph $G - A$, which represents the number of connected components after removing the edge set.
Tenacity $T$ can intuitively represent the decomposition ability of a connected graph after removing a certain part. When the number of removed edges is small, for some important nodes at the edge, even though they are directly connected to many nodes in the network, the topology is not destroyed after removing the connecting edges of the node. This result is consistent with the removal of many isolated nodes at the edge.
Following the calculation method of tenacity by removing edge sets, we define the tenacity of node i as
where $\tau(G - i)$ is the number of nodes of the largest connected subgraph of the undirected graph $G - i$, and $\omega(G - i)$ is the number of connected subgraphs of the undirected graph $G - i$. Obviously, when $\tau(G - i)$ is smaller and $\omega(G - i)$ is larger, the removed node is more important in maintaining network connectivity. For example, in Fig. 1, nodes A and M serve as boundary nodes in the undirected topology network. After removing each of the two nodes from the original network, nodes A and M have the same $T$ value; that is, both nodes have the same global impact. However, it can be seen from the figure that the number of nodes connected to node A is significantly higher than that of node M. Thus, although nodes A and M have the same $T$ value, node A has a stronger propagation ability, and tenacity $T$ cannot accurately quantify the difference between node A and node M in the above sample network.
The above analysis shows that tenacity $T$ only considers the ability of node removal to split the network, which leads to inaccurate identification of the global characteristics. In order to improve the accuracy of the method, more global structural information needs to be considered. Therefore, we propose an improved hierarchical tenacity global coefficient HTGC.
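The two ingredients of the node tenacity, $\tau(G - i)$ and $\omega(G - i)$, can be obtained with a single traversal of the graph after deleting node i. The sketch below computes only these two quantities on a hypothetical path graph; it does not attempt to reproduce the node tenacity formula itself, and the function names are illustrative.

```python
def component_sizes(neighbors, removed):
    """Sizes of the connected components that remain after deleting `removed`."""
    removed = set(removed)
    seen, sizes = set(), []
    for start in neighbors:
        if start in removed or start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for w in neighbors[u]:
                if w not in removed and w not in seen:
                    seen.add(w)
                    stack.append(w)
        sizes.append(size)
    return sizes


def tau_omega(neighbors, i):
    """Return (tau, omega): largest component size and component count of G - i."""
    sizes = component_sizes(neighbors, {i})
    return (max(sizes) if sizes else 0), len(sizes)


# Hypothetical path graph a-b-c-d-e.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
print(tau_omega(path, "c"))  # middle node: the path splits into two halves
print(tau_omega(path, "a"))  # end node: the rest stays in one piece
```

Removing the middle node yields a smaller $\tau$ and a larger $\omega$ than removing an end node, which is the pattern the text associates with a more important node.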
The K-Shell decomposition method 36 is a coarse-grained node importance classification method that divides the network layer by layer from boundary to core based on node location information. The K-Shell value reflects the global position of the node in the network. The larger the K-Shell value, the more central the node's position and the more important the node is. The steps of the K-Shell decomposition method are as follows:
Step 1 Calculate the degrees of all nodes in the network, take the smallest node degree and record it as KS, which is the K-Shell value.
Step 2 Delete all nodes with degree KS in the network, update the network and recalculate the degree values, and recursively delete nodes with degree less than or equal to KS until all node degrees in the network are greater than KS. Mark all deleted nodes with the value KS.
Step 3 Repeat the above steps until all nodes in the network are stripped and marked with their K-Shell values.
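The three steps above can be sketched as follows. This is a minimal illustration of the standard (unimproved) K-Shell decomposition on a hypothetical toy graph, not the network of Fig. 5; the function name `k_shell` is an assumption made for the example.

```python
def k_shell(neighbors):
    """Return a dict mapping each node to its K-Shell value."""
    deg = {v: len(nbrs) for v, nbrs in neighbors.items()}
    alive = set(neighbors)
    shell = {}
    while alive:
        # Step 1: the smallest remaining degree is the current KS value.
        ks = min(deg[v] for v in alive)
        # Step 2: recursively strip nodes whose degree is <= KS.
        queue = [v for v in alive if deg[v] <= ks]
        while queue:
            v = queue.pop()
            if v not in alive:
                continue  # already stripped via another queue entry
            alive.remove(v)
            shell[v] = ks
            for w in neighbors[v]:
                if w in alive:
                    deg[w] -= 1
                    if deg[w] <= ks:
                        queue.append(w)
        # Step 3: loop again with the next shell until no node is left.
    return shell


# Hypothetical toy graph: triangle a-b-c with a tail c-d-e.
g = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
print(k_shell(g))
```

In this example the tail nodes d and e are stripped in the first shell, while the triangle survives until the second, so the coarse grouping of structurally different nodes into the same shell is already visible on a five-node graph.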
Figure 5 shows a network containing 17 nodes and 21 edges, which is used to explain the steps of the K-Shell decomposition method. In this network, as KS rises from 1 to 3, the nodes from the outermost layer to the innermost layer are marked in turn. It is not difficult to see that as the core status of a node increases in the network, its K-Shell value also increases accordingly.
However, using the K-Shell value to represent the importance of a node is too coarse, and a large number of nodes with obvious structural and functional differences have the same K-Shell value. When refining nodes with the same K-Shell value, the actual degree of the node can be used to determine the position of the node within the same shell. As an improvement of the K-Shell decomposition process, the improved K-Shell value IKS is defined as
where $KS_i$ is the K-Shell value of node i, $KS_{i|next}$ is the K-Shell value of the nodes in the next layer after that of i (if i is in the deepest layer, the default is $KS_{i|next} = KS_i + 1$), $k_i$ is the degree of node i, and $k_{i|max}$ is the maximum degree of the nodes in the same layer as node i.
According to Table 2, it is not difficult to conclude that $KS_i < IKS_i < KS_{i|next}$, so the improved K-Shell value IKS is a further refinement of the global attributes of nodes with the same K-Shell value, which can further distinguish the importance of nodes. For both the K-Shell value KS and the improved K-Shell value IKS, the larger the value, the deeper the node's hierarchical position, and the higher the global importance of the node.

Construction of comprehensive indicator of local and global attributes based on the normalization method
The deeper the node is in the network, the later it will be stripped, and it will have a larger K-Shell value.
Table 2. KS and IKS values of the example network nodes in Fig. 5.
In complex network analysis, assessing the importance of nodes is a multidimensional problem. It is often impossible to fully reveal the true role and status of nodes from a single local or global perspective. An effective comprehensive indicator should be able to combine these two aspects. To this end, the weakly connected network constraint coefficient WNCC and the hierarchical tenacity global coefficient HTGC are combined to yield the hierarchical comprehensive importance coefficient HCIC, which is defined as
where $CL_i$ and $CG_i$ are the normalized weakly connected network constraint coefficient and hierarchical tenacity global coefficient of node i, which are defined respectively as follows
Algorithm 1 shows the pseudo code for calculating the hierarchical comprehensive importance coefficient HCIC of node i. According to the above algorithm, nodes with lower HCIC values have a greater impact on maintaining network connectivity, so the corresponding nodes are more important.
Using such a comprehensive indicator, we can not only evaluate the importance of nodes more comprehensively, but also better understand and predict the dynamic behavior and evolution trends of complex networks.This is of great significance to many fields of network science, such as social network analysis, bioinformatics, and information dissemination.

Classic benchmark algorithm
We use several classic benchmark algorithms to compare the performance of the proposed method, including:
(1) Degree centrality (DC) algorithm
Degree centrality 37 is a basic algorithm for identifying the importance of nodes. The degree of node i is defined as
(2) Collective influence (CI) algorithm
The collective influence 38 of node i is defined as
where set(i, l) represents the set of all nodes whose distances from node i are less than l.
(3) WL algorithm
The WL algorithm 39 is an identification method based on node degree and adjacent node degree, which is defined as
(4) DWT algorithm
The DWT algorithm 40 is a method that quantifies link strength based on local information of network topology and evaluates the importance of nodes based on the number of connections and overlap degree of neighbor nodes, which is defined as
where $S_{ij}$ is the Salton indicator of node i and node j.
(5) K-Shell decomposition method
The K-Shell decomposition method is a coarse-grained node importance identification algorithm that divides the network layer by layer from boundary to core based on node location information. The implementation steps of this method have been introduced above.
(6) KPD algorithm
The KPD algorithm 41 is an improved algorithm based on the K-Shell decomposition method, which is defined as
where $KS_i$ is the K-Shell value of node i, $l_i$ is the stripping order of node i in the same layer, and $l_{max,i}$ is the maximum stripping order in the same layer as node i.
(7) INCC algorithm
The INCC algorithm 42 combines the direct and indirect effects of the nearest neighbors and second-nearest neighbors, which is defined as
where $p_{ij}$ is the proportion of energy investment directly related to node i and node j.
(8) Random algorithm
A random algorithm ranks the importance of network nodes through random scoring.
(9) CIM algorithm
The CIM algorithm 43 is a method for identifying key nodes in complex networks based on the global structure. It constructs a comprehensive influence matrix from three aspects (shortest path length, number of shortest paths and number of non-shortest paths) to reflect the influence between nodes, which is defined as
where CM is the comprehensive influence matrix.
(10) GLS algorithm
The GLS algorithm 44 also considers both the local and global structures of the network, which is defined as
where $GI_i$ and $LI_i$ are respectively the global influence and local influence of node i.

Evaluation indicators
In the above content, we analyzed the local attributes and global attributes of complex network topology nodes, and designed two evaluation indicators: the weakly connected network constraint coefficient WNCC and the hierarchical tenacity global coefficient HTGC. These two indicators are normalized and integrated, and the hierarchical comprehensive importance coefficient HCIC is proposed as an evaluation indicator for network node importance.
In order to verify the rationality of the HCIC identification algorithm, it is compared with other classic importance identification algorithms in a comparative experiment based on different evaluation indicators.
The nodes are sorted in descending order of importance according to the ranking values generated by the different algorithms. The experiment evaluates the advantages and disadvantages of different node importance identification algorithms by comparing the connectivity properties of the remaining subgraphs after each algorithm removes a given proportion of the most important nodes.
To indicate the connectivity of the remaining subgraph after removing a number of important nodes, commonly used evaluation indicators include:
(1) Maximum connectivity coefficient
The maximum connectivity coefficient $P_{Subset}$ is an important indicator for evaluating the performance of the identification algorithm, which is defined as

The Hamrle2 dataset is a simulated circuit network, containing 5952 electrical nodes and 22162 circuit element edges. This dataset can be used to determine the voltage and current relationships over time at various points in the circuit.
The basic attributes of the network corresponding to each dataset are shown in Table 4, where $N$ is the number of nodes in the network, $M$ is the number of edges in the network, $\langle k \rangle$ is the average degree of network nodes, $k_{max}$ is the maximum degree of network nodes, $C_{avg}$ is the average clustering coefficient of the network, and $d$ is the network density.
In this experiment, some classic identification algorithms are used as the reference objects of the HCIC algorithm, such as the DC algorithm, CI algorithm, K-Shell decomposition method, INCC algorithm and random algorithm. By implementing the above identification algorithms, the importance of each node in the undirected topology network can be intuitively compared. Importance is arranged in ascending or descending order depending on the sorting indicator, and the specific ranking method depends on the specific ranking indicator.

Results and analysis
In order to intuitively reflect the impact of different node importance identification algorithms on network topology, we selected the small email-enron-only network dataset, containing 143 nodes and 623 edges, for testing. Using the WL, DWT and HCIC algorithms, all nodes in the original network are ranked by importance. After deleting the top 20% of nodes by importance, we calculate the number of nodes contained in the maximum connected subgraph of each remaining network.
As shown in Fig. 6, the yellow area represents the maximum connected subgraph after deleting nodes. The maximum connected subgraph sizes after using the WL, DWT and HCIC algorithms account for 76.92%, 70.63% and 57.34% of the number of original network nodes respectively. This is also reflected in the size of the yellow area in the figure. Therefore, after deleting the same proportion of important nodes, our proposed algorithm accelerates the decomposition of the original network's connectivity and better identifies nodes with greater importance in the network. Next, different identification algorithms are experimentally verified.

Maximum connectivity coefficient
The experimental results of the maximum connectivity coefficient are shown in Fig. 7. The maximum connectivity coefficient $P_{Subset}$ reflects the proportion of the original network that is occupied by the maximum connected subgraph after removing nodes.
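A node removal experiment of this kind can be sketched as follows, assuming that $P_{Subset}$ is the size of the largest connected component after removal divided by the original number of nodes N, as the description above suggests. The helper names, the star graph and the use of degree as the ranking score are hypothetical choices for illustration only.

```python
def top_fraction(scores, ratio, descending=True):
    """Return the top `ratio` share of nodes ranked by score.

    For indicators where a lower value means a more important node
    (as stated for HCIC), pass descending=False.
    """
    ranked = sorted(scores, key=scores.get, reverse=descending)
    return ranked[: int(ratio * len(ranked))]


def largest_component_fraction(neighbors, removed):
    """Largest remaining component size divided by the original node count."""
    removed = set(removed)
    seen, best = set(), 0
    for start in neighbors:
        if start in removed or start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for w in neighbors[u]:
                if w not in removed and w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best / len(neighbors)


# Hypothetical star graph with 5 nodes, ranked by degree.
star = {"c": {"w", "x", "y", "z"}, "w": {"c"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}
degree = {v: len(nbrs) for v, nbrs in star.items()}
removed = top_fraction(degree, 0.2)  # top 20% of 5 nodes: the center only
print(largest_component_fraction(star, removed))
```

A ranking that places the true hub first drives this fraction down quickly, which is the behavior the tables in this section compare across algorithms.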
Table 5 shows the maximum connectivity coefficient $P_{Subset}$ of each algorithm after deleting the top 10% of nodes by importance. The maximum connectivity coefficient of the HCIC algorithm is the smallest on all six datasets. This shows that after removing the first 10% of important nodes identified using the HCIC algorithm, the remaining largest connected subgraph becomes much smaller. Therefore, the key nodes identified by the HCIC algorithm play a key role in the stability of the network structure.
The maximum connectivity coefficient $P_{Subset}$ can also be used as an indicator for network robustness analysis. In Table 5, after using the same algorithm to remove the top 10% of nodes by importance, the ratio of the remaining largest connected subgraph in the tech-routers-rf dataset is the highest among the datasets for five of the algorithms, and for the remaining four algorithms it is second only to that of the ukerbe1 dataset. This shows that the network is able to maintain a larger connected subgraph even when key nodes are removed, showing greater resistance to interference and node failures.
It can be seen from Table 5 that when using the ukerbe1 dataset for experiments, the HCIC algorithm destroys the maximum connected subgraph significantly more effectively than the other algorithms. When the maximum node degree in the network is small, the HCIC algorithm breaks up the maximum connected subgraph at a relatively even rate.

Average remaining edges of the network
The experimental results of the average remaining edges of the network are shown in Fig. 8. The average remaining edges of the network $P_{Edges}$ reflects the proportion of the original network's edges that remain after removing the nodes. Table 6 shows the average remaining edges $P_{Edges}$ of the network after removing the top 10% of nodes by importance for each algorithm. The average remaining edges corresponding to the HCIC algorithm are the lowest on four datasets, and on the remaining two datasets the results differ only slightly from the lowest values. This shows that after removing the first 10% of important nodes identified using the HCIC algorithm, the number of remaining edges becomes much smaller. Therefore, the HCIC algorithm has a stronger ability to identify vulnerable nodes in the network than other algorithms.
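The edge-based counterpart can be sketched in the same way, assuming that the quantity of interest for a single removal step is the number of edges with both endpoints still present divided by the original number of edges M. Averaging over several removal ratios, which the name of the indicator suggests, is left out of this sketch; the function name and the star graph are hypothetical.

```python
def remaining_edge_fraction(edges, removed):
    """Share of the original edges whose two endpoints both survive."""
    removed = set(removed)
    kept = sum(1 for u, v in edges if u not in removed and v not in removed)
    return kept / len(edges)


# Hypothetical star graph with center c and four leaves.
edges = [("c", "w"), ("c", "x"), ("c", "y"), ("c", "z")]
print(remaining_edge_fraction(edges, ["c"]))  # removing the hub deletes every edge
print(remaining_edge_fraction(edges, ["w"]))  # removing a leaf deletes one edge
```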
Similar to the maximum connectivity coefficient $P_{Subset}$, the average remaining edges $P_{Edges}$ can also be used as an indicator for network robustness analysis. In Table 6, after applying the same algorithm to remove the top 10% of nodes by importance, the ratio of the remaining edges in the ukerbe1 dataset is the highest among the datasets for eight algorithms, and is not the maximum only for the random algorithm, which sorts the importance of network nodes through random scoring. This shows that the network can maintain as many edges as possible even when key nodes are removed, and can better adapt to dynamic changes in nodes without affecting overall performance.
It can be seen from Table 6 that although the HCIC algorithm ranks among the best of all tested algorithms in removing network edges, when conducting experiments using the bn-fly-drosophila_medulla dataset, the algorithm's destruction effect on the edges in the network is not much different from that of other algorithms. For networks with high average node degrees, the HCIC algorithm may not be able to quickly

Network sensitivity
The experimental results of network sensitivity are shown in Fig. 9. The sensitivity indicator $S$ reflects the degree to which the original network is decomposed during the removal of nodes. Table 7 shows the node removal ratio $p$ for peak sensitivity in different datasets. The corresponding removal ratios of the HCIC algorithm in the six datasets are the lowest among all algorithms, implying that the original network is decomposed into segments less than or equal to the threshold $\sigma$ to the maximum extent after removing a small number of important nodes. Therefore, the important nodes identified by the HCIC algorithm are more important in protecting network integrity and stability.
It can be seen from Table 7 that when using the ukerbe1 dataset for experiments, the HCIC algorithm makes the network reach the highest sensitivity after deleting a very small proportion of important nodes, while the proportion of nodes that need to be deleted using other algorithms is much higher. When the maximum node degree in the network is small, it is easier for the HCIC algorithm to identify the important nodes, so that the network can be decomposed to the greatest extent after deleting these nodes.
By adjusting the node removal ratio, we can also determine the stability state of the network under specific conditions.This helps optimize the structure of the network so that it exhibits better stability in the face of node removal or other external disturbances.

Network monotonicity
The experimental results of network monotonicity are shown in Table 8. The monotonicity indicator $m$ reflects the ability of the identification algorithm to distinguish the importance of nodes.
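To make the idea of ranking resolution concrete, the sketch below uses a commonly used definition of ranking monotonicity, $m = \big(1 - \sum_r n_r (n_r - 1) / (n (n - 1))\big)^2$, where $n_r$ is the number of nodes sharing the same score r and n is the total number of nodes. Whether this matches the exact definition used for the indicator $m$ in the experiments is an assumption; the function name is hypothetical.

```python
from collections import Counter


def monotonicity(scores):
    """Ranking monotonicity: 1.0 if all scores differ, 0.0 if all are tied."""
    n = len(scores)
    if n < 2:
        return 1.0
    ties = Counter(scores.values())  # n_r: number of nodes per distinct score
    penalty = sum(c * (c - 1) for c in ties.values()) / (n * (n - 1))
    return (1.0 - penalty) ** 2


print(monotonicity({"a": 3, "b": 2, "c": 1}))  # every node has a unique rank
print(monotonicity({"a": 1, "b": 1, "c": 2}))  # one tie lowers the value
print(monotonicity({"a": 1, "b": 1, "c": 1}))  # all nodes indistinguishable
```

Under this definition a coarse indicator such as the raw K-Shell value, which assigns the same score to many nodes, scores low, while an indicator that gives most nodes a distinct value scores close to 1.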
For the six datasets used in the experiment, the HCIC algorithm demonstrates the best network monotonicity in five datasets, and the monotonicity is slightly lower than the CI algorithm only in the tech-routers-rf dataset.Therefore, the node importance identification algorithm we proposed can provide unique ranking indicators for most nodes in the network at a high resolution.
As can be seen from Table 8, on the p2p-Gnutella08 dataset the HCIC algorithm distinguishes node importance less well than on the other datasets. P2P networks are usually designed as decentralized networks: there is no fixed central node or server, and each node can act as both client and server. This design makes the function and importance of each node relatively uniform, without an obvious hierarchical structure or centralized features, so it is usually difficult to rank the nodes of such a network with very high discrimination.
Algorithms with better monotonicity can ensure more reasonable node ordering, thereby improving the accuracy and effectiveness of decision-making. By properly ranking nodes, the system can also better respond to node failures or network abnormalities, achieving improved fault tolerance.
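As an illustration of what a monotonicity score measures, the sketch below uses the definition commonly found in the literature, M = (1 − Σ_r n_r(n_r − 1) / (n(n − 1)))², where n is the number of nodes and n_r is the number of nodes sharing the same importance value r. Whether the indicator m in Table 8 follows exactly this form is an assumption, and the function name and example scores are hypothetical.

```python
from collections import Counter


def monotonicity(scores):
    """Return M in [0, 1]; 1 means every node has a unique importance value."""
    n = len(scores)
    if n < 2:
        return 1.0  # a single node is trivially uniquely ranked
    group_sizes = Counter(scores.values())  # n_r for each distinct value r
    tied = sum(c * (c - 1) for c in group_sizes.values())
    return (1.0 - tied / (n * (n - 1))) ** 2


all_distinct = monotonicity({"a": 0.9, "b": 0.5, "c": 0.1})
all_equal = monotonicity({"a": 1, "b": 1, "c": 1})
one_tie = monotonicity({"a": 1, "b": 1, "c": 2, "d": 3})
```

With this definition, a network whose nodes are nearly interchangeable, as argued above for decentralized P2P topologies, produces many tied values and therefore a lower score.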

Comparative experiments with other local and global attribute algorithms
To verify that the proposed HCIC algorithm benefits from considering both local and global attributes in complex networks, we use the Hamrle2 dataset to compare HCIC with our proposed WNCC and HTGC algorithms. This contrasts an algorithm that integrates local and global attributes with algorithms that improve only the local or only the global attribute. We also include CIM and GLS, two effective key node identification algorithms, as comparison algorithms. They comprehensively consider local and global attributes at the level of network information transmission efficiency.
Figure 10 shows the results of the comparative experiment for the maximum connectivity coefficient, the average remaining edges of the network, and the network sensitivity. Table 9 shows the maximum connectivity coefficient and the average remaining edges of the network after removing the top 10% of important nodes, as well as the node removal proportion at which peak sensitivity occurs and the network monotonicity. Overall, the HCIC algorithm outperforms the other four algorithms on the various indicators. This shows that integrating local and global attributes is better than considering either separately. In addition, the results of the HTGC algorithm are better than those of WNCC, from which it can be inferred that global attributes have a greater influence in the HCIC algorithm than local attributes.
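The top-10% removal experiment summarized in Table 9 can be sketched roughly as below. The sketch assumes that the maximum connectivity coefficient is the size of the largest connected component after removal divided by the original node count, and that the remaining-edges indicator is the fraction of original edges that survive; both normalizations are assumptions rather than the paper's stated formulas. The function names and the adjacency-dictionary input are hypothetical.

```python
from collections import deque


def largest_component(adj, removed):
    """Size of the largest connected component among the surviving nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best


def evaluate_removal(adj, ranking, ratio=0.10):
    """Delete the top `ratio` of ranked nodes; return (P_Subset, P_Edges)."""
    n = len(adj)
    removed = set(ranking[: round(n * ratio)])
    total_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    # Each surviving undirected edge is seen once from each endpoint.
    surviving = sum(
        1 for u in adj if u not in removed for v in adj[u] if v not in removed
    ) // 2
    return largest_component(adj, removed) / n, surviving / total_edges


# Star network: hub 0 connected to leaves 1..9.
star = {0: set(range(1, 10))}
star.update({leaf: {0} for leaf in range(1, 10)})
hub_first = evaluate_removal(star, [0] + list(range(1, 10)))
leaf_first = evaluate_removal(star, [1, 0] + list(range(2, 10)))
```

On this toy star network, a ranking that places the hub first leaves only isolated leaves, while a ranking that places a leaf first leaves the network almost intact, so lower values of both indicators correspond to a better ranking.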

Conclusion
This paper aims to evaluate the importance of complex network nodes through comprehensive analysis of local and global attributes. To this end, we combine the weakly connected network constraint coefficient and the hierarchical tenacity global coefficient, and propose the HCIC algorithm as a tool for identifying the importance of network nodes. Comparisons with other classic identification algorithms on real network datasets show that the important nodes identified by the HCIC algorithm yield better results in terms of the stability and sensitivity of the network structure. The algorithm also provides unique ranking indicators for most nodes in the network at a high resolution.
With the continuous growth of large-scale networks, node importance identification algorithms need to adapt better to complex and dynamic network topologies. Future research directions may include introducing more flexible models to better capture the correlations and evolutionary trends between nodes. In addition, as network security becomes an increasing concern, node importance identification algorithms should pay more attention to adversarial attacks and robustness. Researchers may explore how to maintain network stability and reliability in the face of node failures or malicious attacks. Overall, the development of node importance identification algorithms will continue to focus on improving their intelligence, adaptability and robustness to meet the needs of increasingly complex and diverse network environments.

Figure 2. A flow chart for realizing node importance identification in a network. The input is a certain network topology, and the output is the ranking result of node importance.

Figure 3. Example network illustrating the effect of weak connections. In the left figure, the information transmission between the neighbors of node M strongly depends on the path connecting them to node M. In the figure on the right, the neighbors of node N can communicate directly through the connection paths between them.

Figure 4. Schematic diagram illustrating the concept of cut sets. For the subsets S_1 and S_2 of the edge set E of graph G, after deleting the edges contained in each subset, it can be judged according to the definition whether the subset is a cut set of graph G.
Scientific Reports | (2024) 14:12039 | https://doi.org/10.1038/s41598-024-62895-2
Based on the improved K-Shell value IKS, the tenacity T is improved by proposing the hierarchical tenacity global coefficient HTGC. HTGC considers the hierarchical structure information of different nodes and refines the global influence indicator. For Fig. 1, node A, which has stronger global importance, has a smaller HTGC than node M. According to Table 3, the HTGC values of the remaining nodes in the network of Fig. 1 also have a clearer hierarchy than the corresponding tenacity T values. Therefore, the HTGC value is an effective improvement on the global tenacity T indicator.

Figure 5. Schematic diagram illustrating the steps of the K-Shell decomposition method. The deeper the node is in the network, the later it will be stripped, and it will have a larger K-Shell value.
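The stripping procedure in this caption can be sketched as follows. Note that this is the classic K-Shell decomposition, not the improved K-Shell value IKS used by the paper, and the function name and adjacency-dictionary input are hypothetical.

```python
def k_shell(adj):
    """Classic K-Shell decomposition: map each node to its shell index."""
    degree = {u: len(nbrs) for u, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # Jump to the smallest residual degree if it exceeds the current k.
        k = max(k, min(degree[u] for u in remaining))
        stack = [u for u in remaining if degree[u] <= k]
        while stack:
            u = stack.pop()
            if u not in remaining:
                continue  # already stripped via another stack entry
            remaining.discard(u)
            shell[u] = k
            for v in adj[u]:
                if v in remaining:
                    degree[v] -= 1  # stripping u lowers its neighbors' degree
                    if degree[v] <= k:
                        stack.append(v)
        k += 1
    return shell


# Triangle 0-1-2 with a two-node tail 0-3-4 hanging off node 0.
graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3}}
shells = k_shell(graph)
```

In this toy graph the tail nodes are stripped first and receive the smaller shell value, while the triangle nodes are stripped later and receive the larger one, matching the behavior described in the caption.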

Figure 6. A small network used to reflect the impact of different node importance identification algorithms on network connectivity. The upper left picture shows the original network topology. The yellow areas in the remaining three pictures are the largest connected subgraph after using the WL, DWT and HCIC algorithms to sort node importance and delete the top 20% of nodes.

Figure 7. The maximum connectivity coefficient corresponding to different networks after removing a certain proportion of important nodes. The abscissa represents the proportion of nodes removed after being sorted in descending order of importance, and the ordinate represents the corresponding maximum connectivity coefficient P_Subset. (a) bio-DM-HT, (b) bn-fly-drosophila_medulla, (c) CL-10000-2d0-trial3, (d) p2p-Gnutella08, (e) tech-routers-rf, (f) ukerbe1.

Figure 9. The corresponding network sensitivity of different networks after removing a certain proportion of important nodes.

Figure 10. Experimental results of maximum connectivity coefficient, network average remaining edges and network sensitivity using the Hamrle2 dataset.

Table 1. NCC and WNCC values of the example network nodes in Fig. 1.
Local attributes reflect a node's direct influence within its neighborhood and micro-position in the network, while global attributes reveal a node's influence and macro-position in the entire network structure. Therefore, developing a comprehensive indicator that integrates local and global attributes is crucial for a deep understanding of the comprehensive importance of nodes.

Table 4. Basic attributes of the network corresponding to each dataset.

Table 5. Maximum connectivity coefficient P_Subset after deleting the top 10% of nodes by importance. Significant values are in bold.
The average remaining edges of different networks after removing a certain proportion of important nodes.

Table 6. Average remaining edges of the network P_Edges after deleting the top 10% of nodes by importance. Significant values are in bold.

Table 7. The node removal ratio p when peak sensitivity occurs. Significant values are in bold.
each node relatively uniform, without obvious hierarchical structure or centralized features. This type of network usually has difficulty in identifying the importance of nodes with extremely high discrimination.

Table 8. Experimental results of network monotonicity m. Significant values are in bold.

Table 9. Experimental results of maximum connectivity coefficient, network average remaining edges, network sensitivity and network monotonicity using the Hamrle2 dataset. Significant values are in bold.